Lexical Coverage Issues for Speech Recognition in Indian Languages
Abstract
This report investigates lexical coverage in Indian languages. Specifically, it presents a parallel analysis of out-of-vocabulary (OOV) words in Telugu and Tamil. Although the approach is generic, the study focuses on the morphological aspects of these languages that matter for speech recognition. The observations show that morphological analysis and preprocessing can increase lexical coverage by over 50%, bringing it closer to the figures seen for English.
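As a rough illustration of the quantities discussed in the abstract, the sketch below computes lexical coverage (the fraction of test tokens found in a vocabulary built from training text) before and after a simple suffix split. It is not the report's method: the function names, the toy English word lists, and the two-suffix splitter are all assumptions made for illustration, and a real system for Telugu or Tamil would need a proper morphological analyzer.

```python
from collections import Counter

# Hypothetical suffix list, for illustration only.
SUFFIXES = ("ing", "ed")


def segment(word):
    """Split off one known suffix, e.g. 'walked' -> ['walk', '+ed']."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            return [word[:-len(suffix)], "+" + suffix]
    return [word]


def lexical_coverage(train_tokens, test_tokens, vocab_size):
    """Fraction of test tokens covered by the top `vocab_size` training tokens."""
    if not test_tokens:
        raise ValueError("test_tokens must not be empty")
    counts = Counter(train_tokens)
    vocab = {token for token, _ in counts.most_common(vocab_size)}
    covered = sum(1 for token in test_tokens if token in vocab)
    return covered / len(test_tokens)


def segment_all(words):
    """Apply `segment` to every word and flatten the result."""
    return [unit for word in words for unit in segment(word)]


# Toy data (assumed, not taken from the report).
train_words = ["walk", "walked", "talk", "talking", "jump"]
test_words = ["walking", "talked", "jumped", "walk"]

word_coverage = lexical_coverage(train_words, test_words, vocab_size=10)
morph_coverage = lexical_coverage(
    segment_all(train_words), segment_all(test_words), vocab_size=10
)

print(f"word-level coverage:  {word_coverage:.2f}")   # 0.25
print(f"morph-level coverage: {morph_coverage:.2f}")  # 1.00
```

The OOV rate is simply one minus the coverage. In this toy case, unseen inflected forms such as "jumped" are out of vocabulary as whole words but are fully covered once stems and suffixes are counted as separate units, which is the kind of effect the abstract attributes to morphological preprocessing.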
Similar resources
Speech Recognition of European Languages
A basic overview is presented of the main ongoing efforts in large vocabulary, continuous speech recognition (LVCSR) for European languages. We address issues in acoustic modeling, lexical representation, and language modeling for several European languages, as well as issues in comparative evaluation.
The Relationship between Syntactic and Lexical Complexity in Speech Monologues of EFL Learners
This study aims to explore the relationship between syntactic and lexical complexity and also the relationship between different aspects of lexical complexity. To this end, speech monologs of 35 Iranian high-intermediate learners of English on three different tasks (i.e. argumentation, description, and narration) were analyzed for correlations between one measure of sy...
Voice Input/Output Systems for Indian Languages
This paper presents an overview of the problems and prospects of voice input/output to a computer. Current attempts to provide speech input/output facilities to a computer are described. The scope of the speech recognition problem is defined. Issues involved in the design of text-to-speech and speech-to-text systems are discussed. Since any sophisticated voice input/output system uses several l...
Multilingual Speech Recognition for Information Retrieval in Indian Context
This paper analyzes various issues in building an HMM-based multilingual speech recognizer for Indian languages. The system was originally designed for Hindi and Tamil and adapted to incorporate Indian-accented English. Language-specific characteristics in the speech recognition framework are highlighted. The recognizer is embedded in information retrieval applications and hence several iss...
A Lexical Knowledge Driven Manner Based Speech Recognition Model
The emergence of speech as a more direct means of interaction with computers promises the possibility of leapfrogging into the development of tomorrow’s interfaces in Indian Languages for IT to Mass. Speech recognition is one of the most challenging speech technologies for the development of speech-mode man-to-machine communication. Speech recognition is the process of converting an acoust...